Efficient Coroutine Generation of Constrained Gray Sequences 

Donald E. Knuth and Frank Ruskey 
(dedicated to the memory of Ole-Johan Dahl) 

Abstract. We study an interesting family of cooperating coroutines, which is 
able to generate all patterns of bits that satisfy certain fairly general ordering 
constraints, changing only one bit at a time. (More precisely, the directed graph 
of constraints is required to be cycle-free when it is regarded as an undirected 
graph.) If the coroutines are implemented carefully, they yield an algorithm that 
needs only a bounded amount of computation per bit change, thereby solving an 
open problem in the field of combinatorial pattern generation. 

Much has been written about the transformation of procedures from recursive to iterative 
form, but little is known about the more general problem of transforming coroutines into 
equivalent programs that avoid unnecessary overhead. The present paper attempts to 
take a step in that direction by focusing on a reasonably simple yet nontrivial family 
of cooperating coroutines for which significant improvements in efficiency are possible 
when appropriate transformations are applied. The authors hope that this example will 
inspire other researchers to develop and explore the potentially rich field of coroutine 
transformation. 

Coroutines, originally introduced by M. E. Conway [2], are analogous to subroutines, 
but they are symmetrical with respect to caller and callee: When coroutine A invokes 
coroutine B, the action of A is temporarily suspended and the action of B resumes where 
B had most recently left off. Coroutines arise naturally in producer/consumer situations 
or multipass processes, analogous to the "pipes" of UNIX, when each coroutine transforms 
an input stream to an output stream; a sequence of such processes can be controlled in 
such a way that their intermediate data files need not be written in memory. (See, for 
example, Section 1.4.2 of [9].) 

The programming language SIMULA 67 [3] introduced support for coroutines in terms 
of fundamental operations named call, detach, and resume. Arne Wang and Ole-Johan 
Dahl subsequently discovered [20] that an extremely simple computational model is able to 
accommodate these primitive operations. Dahl published several examples to demonstrate 
their usefulness in his chapter of the book Structured Programming [4]; then M. Clint [1] 
and O.-J. Dahl [6] began to develop theoretical tools for formal proofs of coroutine cor- 
rectness. 

Another significant early work appeared in R. W. Floyd's general top-down parsing 
algorithm for context-free languages [8], an algorithm that involved "imaginary men who 
are assumed to automatically appear when hired, disappear when fired, remember the 
names of their subordinates and superiors, and so on." Floyd's imaginary men were es- 
sentially carrying out coroutines, but their actions could not be described naturally in any 
programming languages that were available to Floyd when he wrote about the subject in 
1964, so he presented the algorithm as a flow chart. Ole-Johan Dahl later gave an elegant 
implementation of Floyd's algorithm using the features of SIMULA 67, in §2.1.2 of [5]. 

The coroutine concept was refined further during the 1970s; see, for example, [19] 
and the references cited therein. But today's programming languages have replaced those 
ideas with more modern notions such as "threads" and "closures," which (while admirable 
in themselves) support coroutines only in a rather awkward and cumbersome manner. 
The simple principles of old-style coroutines, which Dahl called quasi-parallel processes, 
deserve to be resurrected again and given better treatment by the programming languages 
of tomorrow. 

In this paper we will study examples for which a well-designed compiler could trans- 
form certain families of coroutines into optimized code, just as compilers can often trans- 
form recursive procedures into iterative routines that require less space and/or time. 

The ideas presented below were motivated by applications to the exhaustive generation 
of combinatorial objects. For example, consider a coroutine that wants to look at all 
permutations of n elements; it can call repeatedly on a permutation-generation coroutine 
to produce the successive arrangements. The latter coroutine repeatedly forms a new 
permutation and calls on the former coroutine to inspect the result. The permutation 
coroutine has its own internal state — its own local variables and its current location in 
an ongoing computational process — so it does not consider itself to be a "subroutine" of 
the inspection coroutine. The permutation coroutine might also invoke other coroutines, 
which in turn are computational objects with their own internal states. 

We shall consider the problem of generating all n-tuples $a_1 a_2 \ldots a_n$ of 0s and 1s with 
the property that $a_j \le a_k$ whenever $j \to k$ is an arc in a given directed graph. Thus 
$a_j = 1$ implies that $a_k$ must also be 1; if $a_k = 0$, so is $a_j$. These n-tuples are supposed 
to form a "Gray path," in the sense that only one bit $a_j$ should change at each step. For 
example, if n = 3 and if we require $a_1 \le a_3$ and $a_2 \le a_3$, five binary strings $a_1 a_2 a_3$ satisfy 
the inequalities, and one such Gray path is 

000, 001, 011, 111, 101. 
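The small example above can be checked by brute force. The following is a minimal sketch, not part of the paper: the function names `constrained_tuples` and `is_gray_path` are hypothetical, and arcs are given as pairs (j, k) meaning $a_j \le a_k$.

```python
from itertools import product

def constrained_tuples(n, arcs):
    """All bit strings a_1...a_n with a_j <= a_k for every arc (j, k)."""
    return ["".join(map(str, bits))
            for bits in product((0, 1), repeat=n)
            if all(bits[j - 1] <= bits[k - 1] for j, k in arcs)]

def is_gray_path(path):
    """True if consecutive strings differ in exactly one position."""
    return all(sum(x != y for x, y in zip(p, q)) == 1
               for p, q in zip(path, path[1:]))

arcs = [(1, 3), (2, 3)]                      # a_1 <= a_3 and a_2 <= a_3
print(constrained_tuples(3, arcs))           # the five admissible strings
print(is_gray_path(["000", "001", "011", "111", "101"]))
```

Exhaustive enumeration takes time exponential in n, so it serves only to check small cases; the coroutines developed below generate the same patterns incrementally.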

The general problem just stated does not always have a solution. For example, suppose 
the given digraph is 

[Figure: a digraph on vertices 1 and 2 with arcs in both directions] 

so that the inequalities are $a_1 \le a_2$ and $a_2 \le a_1$; then we are asking for a way to generate 
the tuples 00 and 11 by changing only one bit at a time, and this is clearly impossible. 
Even if we stipulate that the digraph of inequalities should contain no directed cycles, we 
might encounter an example like 

[Figure: a four-vertex digraph with no directed cycles] 

in which the Gray constraint cannot be achieved; here the corresponding 4-tuples 

0000, 0001, 0011, 0101, 0111, 1111 

include four of even weight and two of odd weight, but a Gray path must alternate between 
even and odd. Reasonably efficient methods for solving the problem without Grayness are 
known [17, 18], but we want to insist on single-bit changes. 
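
The parity argument can be sketched as a quick necessary test: a Gray path alternates between even and odd weight, so the two counts may differ by at most one. The code below is an illustration only. The function names are hypothetical, and the arc list `diamond` is an assumption about the second digraph, chosen because it yields exactly the six 4-tuples listed above.

```python
from itertools import product

def parity_counts(n, arcs):
    """Count admissible patterns of even weight and of odd weight."""
    even = odd = 0
    for bits in product((0, 1), repeat=n):
        if all(bits[j - 1] <= bits[k - 1] for j, k in arcs):
            if sum(bits) % 2 == 0:
                even += 1
            else:
                odd += 1
    return even, odd

def parity_allows_gray_path(n, arcs):
    """Necessary (not sufficient) condition for a Gray path to exist."""
    even, odd = parity_counts(n, arcs)
    return abs(even - odd) <= 1

two_cycle = [(1, 2), (2, 1)]                  # a_1 <= a_2 and a_2 <= a_1
diamond = [(1, 2), (1, 3), (2, 4), (3, 4)]    # assumed arcs, see the lead-in
print(parity_counts(2, two_cycle), parity_counts(4, diamond))
```

Passing this test does not guarantee that a Gray path exists; failing it rules one out.
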
We will prove constructively that Gray paths always do exist if we restrict consider- 
ation to directed graphs that are totally acyclic, in the sense that they contain no cycles 
even if the directions of the arcs are ignored. Every component of such a graph is a free 
tree in which a direction has been assigned to each branch between two vertices. Such 
digraphs are called spiders, because of their resemblance to arachnids: 

[Figure: an example spider] 

(In this diagram, as in others below, we assume that all arcs are directed upwards. More 
complicated graph-theoretical spiders have legs that change directions many more times 
than real spider legs do.) The general problem of finding all $a_1 \ldots a_n$ such that $a_j \le a_k$ 
when $j \to k$ in such a digraph is formally called the task of "generating the order ideals of 
an acyclic poset"; it also is called, informally, "spider squishing." 

Sections 1-3 of this paper discuss simple examples of the problem in preparation for 
Section 4, which presents a constructive proof that suitable Gray paths always exist. The 
proof of Section 4 is implemented with coroutines in Section 5, and Section 6 discusses the 
nontrivial task of getting all the coroutines properly launched. 

Section 7 describes a simple technique that is often able to improve the running 
time. A generalization of that technique leads in Section 8 to an efficient coroutine-free 
implementation. Additional optimizations, which can be used to construct an algorithm 
for the spider-squishing problem that is actually loopless, are discussed in Section 9. (A 
loopless algorithm needs only constant time to change each n-tuple to its successor.) 

Section 10 concludes the paper and mentions several open problems connected to 
related work. 

1. The unrestricted case. Let's begin by imagining an array of friendly trolls called 
$T_1, T_2, \ldots, T_n$. Each troll carries a lamp that is either off or on; he also can be either 
awake or asleep. Initially all the trolls are awake, and all their lamps are off. 

Changes occur to the system when a troll is "poked," according to the following simple 
rules: If $T_k$ is poked when he is awake, he changes the state of his lamp from off to on or 
vice versa; then he becomes tired and goes to sleep. Later, when the sleeping $T_k$ is poked 
again, he wakes up and pokes his left neighbor $T_{k-1}$, without making any change to his 
own lamp. (The leftmost troll $T_1$ has no left neighbor, so he simply awakens when poked.) 

At periodic intervals an external driving force D pokes the rightmost troll $T_n$, initiating 
a chain of events that culminates in one lamp changing its state. The process begins as 
follows, if we use the digits 0 and 1 to represent lamps that are respectively off or on, and 
if we underline the digit of a sleeping troll: 



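Lamps   Asleep                Event 
0000    none                  Initial state 
0001    T_n                   D pokes T_n 
0011    T_{n-1}               D pokes T_n, who wakes up and pokes T_{n-1} 
0010    T_{n-1}, T_n          D pokes T_n 
0110    T_{n-2}               D pokes T_n, who pokes T_{n-1}, who pokes T_{n-2} 
0111    T_{n-2}, T_n          D pokes T_n 
0101    T_{n-2}, T_{n-1}      D pokes T_n, who pokes T_{n-1} 

(Here the middle column lists the sleeping trolls, whose digits are the underlined ones.) 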



The sequence of underlined versus not-underlined digits acts essentially as a binary counter. 
And the sequence of digit patterns, in which exactly one bit changes at each step, is a Gray 
binary counter, which follows the well-known Gray binary code; it also corresponds to the 
process of replacing rings in the classic Chinese ring puzzle [12]. Therefore the array of 
trolls solves our problem of generating all n-tuples $a_1 a_2 \ldots a_n$, in the special case when 
the spider digraph has no arcs. (This troll-oriented way to generate Gray binary code was 
presented by the first author in a lecture at the University of Oslo in October, 1972 [10].) 

During the first $2^n$ steps of the process just described, troll $T_n$ is poked $2^n$ times, troll 
$T_{n-1}$ is poked $2^{n-1}$ times, ..., and troll $T_1$ is poked twice. The last step is special because 
$T_1$ has no left neighbor; when he is poked the second time, all the trolls wake up, but no 
lamps change. The driver D would like to know about this exceptional case, so we will 
assume that $T_n$ sends a message to D after being poked, saying 'true' if one of the lamps 
has changed, otherwise saying 'false'. Similarly, if $1 \le k < n$, $T_k$ will send a message to 
$T_{k+1}$ after being poked, saying 'true' if and only if one of the first k lamps has just changed 
state. 

These hypothetical trolls $T_1, \ldots, T_n$ correspond to n almost-identical coroutines 
poke[1], ..., poke[n], whose actions can be expressed in an ad hoc Algol-like language 
as follows: 



Boolean coroutine poke[k]; 
while true do begin 
    awake: a[k] := 1 - a[k]; return true; 
    asleep: if k > 1 then return poke[k - 1] else return false; 
end. 



Coroutine poke[k] describes the action of $T_k$, implicitly retaining its own state of wakefulness: 
When poke[k] is next activated after having executed the statement 'return true' 
it will resume its program at label 'asleep'; and it will resume at label 'awake' when it is 
next activated after 'return poke[k - 1]' or 'return false'. 

In this example and in all the coroutine programs below, the enclosing 'while true do 
begin (P) end' merely says that program (P) should be repeated endlessly; all coroutines 
that we shall encounter in this paper are immortal. (This is fortunate, because Dahl [6] 
has observed that proofs of correctness tend to be much simpler in such cases.) 
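
Python generators resume where they last left off, so they can stand in for these coroutines. The following is a minimal sketch under that assumption, not the authors' implementation: each 'return v' becomes `yield v`, invoking poke[k - 1] becomes `next(co[k - 1])`, and the names `co` and `gray_half_cycle` are hypothetical.

```python
def poke(k, a, co):
    """Troll T_k as a generator; each next() call is one poke."""
    while True:
        a[k] = 1 - a[k]            # awake: toggle the lamp, then fall asleep
        yield True
        if k > 1:                  # asleep: wake up and poke the left neighbor
            yield next(co[k - 1])
        else:
            yield False

def gray_half_cycle(n):
    """Collect lamp patterns until the root coroutine first reports false."""
    a = [0] * (n + 1)              # lamps a[1..n]; a[0] is unused
    co = {}
    for k in range(1, n + 1):
        co[k] = poke(k, a, co)
    patterns = []
    while True:
        patterns.append("".join(map(str, a[1:])))
        if not next(co[n]):        # the root is poke[n]
            return patterns

print(gray_half_cycle(3))
```

Because the generators are never exhausted, they mirror the "immortal" coroutines of the text; continuing to call `next(co[n])` would replay the patterns in reverse order.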

Our coroutines will also always be "ultra-lightweight" processes, in the sense that 
they need no internal stack. They need only remember their current positions within their 
respective programs, along with a few local variables in some cases, together with the global 
"lamp" variables a[1], ..., a[n]. We can implement them using a single stack, essentially as 
if we were implementing recursive procedures in the normal way, pushing the address of a 
return point within A onto the stack when coroutine A invokes coroutine B, and resuming A 
after B executes a return. (Wang and Dahl [20] used the term "semicoroutine" for this 
special case. We are, however, using return statements to return a value, instead of using 
global variables for communication and saying 'detach' as Wang and Dahl did.) The 
only difference between our coroutine conventions and ordinary subroutine actions is that 
a newly invoked coroutine always begins at the point following its most recent return, 
regardless of who had previously invoked it. No coroutine will appear on the execution 
stack more than once at any time. 

Thus, for example, the coroutines poke[1] and poke[2] behave as follows when n = 2: 

00    Initial state 
01    poke[2] = true 
11    poke[2] = poke[1] = true 
10    poke[2] = true 
10    poke[2] = poke[1] = false 
11    poke[2] = true 
01    poke[2] = poke[1] = true 
00    poke[2] = true 
00    poke[2] = poke[1] = false 



The same cycle will repeat indefinitely, because everything has returned to its initial state. 

Notice that the repeating cycle in this example consists of two distinct parts. The 
first half cycle, before false is returned, generates all two-bit patterns in Gray binary order 
(00,01,11,10); the other half generates those patterns again, but in the reverse order 
(10,11,01,00). Such behavior will be characteristic of all the coroutines that we shall 
consider for the spider-squishing problem: Their task will be to run through all n-tuples 
$a_1 \ldots a_n$ such that $a_j \le a_k$ for certain given pairs (j, k), always returning true until all 
permissible patterns have been generated; then they are supposed to run through those 
n-tuples again in reverse order, and to repeat the process ad infinitum. 

Under these conventions, a driver program of the following form will cycle through 
the answers, printing a line of dashes between each complete listing: 

⟨Create all the coroutines⟩; 
⟨Put each lamp and each coroutine into the proper initial state⟩; 
while true do begin 
    for k := 1 step 1 until n do write(a[k]); 
    write(newline); 
    if not root then write("-----", newline); 
end. 

Here root denotes a coroutine that can potentially activate all the others; for example, 
root is poke[n] in the particular case that we've been considering. In practice, of course, 
the driver would normally carry out some interesting process on the bits $a_1 \ldots a_n$, instead 
of merely outputting them to a file. 

The fact that coroutines poke[1], ..., poke[n] do indeed generate Gray binary code is 
easy to verify by induction on n. The case n = 1 is trivial, because the outputs will clearly 
be 0, 1, a line of dashes, 1, 0, a line of dashes, 0, 1, a line of dashes, 
and so on. On the other hand if n > 1, assume that the successive contents of $a_1 \ldots a_{n-1}$ 
are $\alpha_0, \alpha_1, \alpha_2, \ldots$ when we repeatedly invoke poke[n - 1], assuming that $\alpha_0 = 0 \ldots 0$ and 
that all coroutines are initially at the label 'awake'; assume further that false is returned 
just before $\alpha_m$ when m is a multiple of $2^{n-1}$, otherwise the returned value is true. Then 
repeated invocations of poke[n] will lead to the successive lamp patterns 

$\alpha_0 0,\ \alpha_0 1,\ \alpha_1 1,\ \alpha_1 0,\ \alpha_2 0,\ \alpha_2 1,\ \ldots$ 

and false will be returned after every sequence of $2^n$ outputs. These are precisely the 
patterns of n-bit Gray binary code, alternately in forward order and reverse order. 

2. Chains. Now let's go to the opposite extreme and suppose that the digraph of 
constraints is an oriented path or chain, 

$1 \to 2 \to \cdots \to n$. 

In other words, we want now to generate all n-tuples $a_1 a_2 \ldots a_n$ such that 

$0 \le a_1 \le a_2 \le \cdots \le a_n \le 1$, 

proceeding alternately forward and backward in Gray order. Of course this problem is 
trivial, but we want to do it with coroutines so that we'll be able to tackle more difficult 
problems later. 

Here are some coroutines that do the new job, if the driver program initiates action 
by invoking the root coroutine bump[l]: 

Boolean coroutine bump[k]; 
while true do begin 
    awake0: if k < n then while bump[k + 1] do return true; 
        a[k] := 1; return true; 
    asleep1: return false; comment a[k] ... a[n] = 1 ... 1; 
    awake1: a[k] := 0; return true; 
    asleep0: if k < n then while bump[k + 1] do return true; 
        return false; comment a[k] ... a[n] = 0 ... 0; 
end. 

For example, the process plays out as follows when n = 3: 

000    Initial state                                123 
001    bump[1] = bump[2] = bump[3] = true           123 
011    bump[1] = bump[2] = true, bump[3] = false    12 
111    bump[1] = true, bump[2] = false              1 
111    bump[1] = false                              1 
011    bump[1] = true                               12 
001    bump[1] = bump[2] = true                     123 
000    bump[1] = bump[2] = bump[3] = true           123 
000    bump[1] = bump[2] = bump[3] = false          123 



Each troll's action now depends on whether his lamp is lit as well as on his state of 
wakefulness. A troll with an unlighted lamp always passes each bump to the right, without 
taking any notice unless a false reply comes back. In the latter case, he acts as if his lamp 
had been lit — namely, he either returns false (if just awakened), or he changes the lamp, 
returns true, and nods off. The Boolean value returned in each case is true if and only if 
a lamp has changed its state during the current invocation of bump[k]. 
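
The bump coroutines can be simulated in the same generator style. This is a sketch under the assumption that `yield` plays the role of 'return' and `next(co[k + 1])` the role of invoking bump[k + 1]; the helper name `chain_cycle` is hypothetical.

```python
def bump(k, n, a, co):
    """Coroutine bump[k] for the chain a_1 <= a_2 <= ... <= a_n."""
    while True:
        if k < n:                      # awake0: pass bumps to the right first
            while next(co[k + 1]):
                yield True
        a[k] = 1
        yield True
        yield False                    # asleep1: a[k..n] are all 1
        a[k] = 0                       # awake1
        yield True
        if k < n:                      # asleep0
            while next(co[k + 1]):
                yield True
        yield False                    # a[k..n] are all 0

def chain_cycle(n):
    """Return the forward and backward halves of one full cycle."""
    a = [0] * (n + 1)                  # lamps a[1..n]; a[0] is unused
    co = {}
    for k in range(1, n + 1):
        co[k] = bump(k, n, a, co)
    halves = []
    for _ in range(2):
        half = []
        while True:
            half.append("".join(map(str, a[1:])))
            if not next(co[1]):        # the root is bump[1]
                break
        halves.append(half)
    return halves

forward, backward = chain_cycle(3)
print(forward, backward)
```

The two halves correspond to the rows of the n = 3 example before and after bump[1] first returns false.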

(Note: The numbers '123', '123', ... at the right of this example correspond to an 
encoding that will be explained in Section 8 below. A similar column of somewhat in- 
scrutable figures will be given with other examples we will see later, so that the principles 
of Section 8 will be easier to understand when we reach that part of the story. There is no 
need to decipher such notations until then; all will be revealed eventually.) 

The dual situation, in which all inequalities are reversed so that we generate all 
$a_1 a_2 \ldots a_n$ with 

$1 \ge a_1 \ge a_2 \ge \cdots \ge a_n \ge 0$, 

can be implemented by interchanging the roles of 0 and 1 and starting the previous sequence 
in the midpoint of its period: 

Boolean coroutine cobump[k]; 
while true do begin 
    awake0: a[k] := 1; return true; 
    asleep1: if k < n then while cobump[k + 1] do return true; 
        return false; comment a[k] ... a[n] = 1 ... 1; 
    awake1: if k < n then while cobump[k + 1] do return true; 
        a[k] := 0; return true; 
    asleep0: return false; comment a[k] ... a[n] = 0 ... 0; 
end. 

A mixed situation in which the constraints are 

$0 \le a_n \le a_{n-1} \le \cdots \le a_{m+1} \le a_1 \le a_2 \le \cdots \le a_m \le 1$ 

is also worthy of note. Again the underlying digraph is a chain, and the driver repeatedly 
bumps troll $T_1$; but when 1 < m < n, the coroutines are a mixture of those we've just 
seen: 


Boolean coroutine mbump[k]; 
while true do begin 
    awake0: if k < m then while mbump[k + 1] do return true; 
        a[k] := 1; return true; 
    asleep1: if m < k and k < n then while mbump[k + 1] do return true; 
        if k = 1 and m < n then while mbump[m + 1] do return true; 
        return false; 
    awake1: if m < k and k < n then while mbump[k + 1] do return true; 
        if k = 1 and m < n then while mbump[m + 1] do return true; 
        a[k] := 0; return true; 
    asleep0: if k < m then while mbump[k + 1] do return true; 
        return false; 
end. 

The reader is encouraged to simulate the mbump coroutines by hand when, say, m = 2 
and n = 4, in order to develop a better intuition about coroutine behavior. Notice that 
when $m \approx \frac{1}{2}n$, signals need to propagate only about half as far as they do when m = 1 or 
m = n. 
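
As a complement to the hand simulation, the case m = 2, n = 4 can be run mechanically. The sketch below assumes the same generator encoding as before (`yield` for 'return', `next` for invocation); `mixed_cycle` is a hypothetical helper name.

```python
def mbump(k, m, n, a, co):
    """Coroutine mbump[k] for a_n <= ... <= a_{m+1} <= a_1 <= ... <= a_m."""
    while True:
        if k < m:                              # awake0
            while next(co[k + 1]):
                yield True
        a[k] = 1
        yield True
        if m < k < n:                          # asleep1
            while next(co[k + 1]):
                yield True
        if k == 1 and m < n:
            while next(co[m + 1]):
                yield True
        yield False
        if m < k < n:                          # awake1
            while next(co[k + 1]):
                yield True
        if k == 1 and m < n:
            while next(co[m + 1]):
                yield True
        a[k] = 0
        yield True
        if k < m:                              # asleep0
            while next(co[k + 1]):
                yield True
        yield False

def mixed_cycle(m, n):
    """Forward and backward halves of one cycle, driven through mbump[1]."""
    a = [0] * (n + 1)                          # lamps a[1..n]; a[0] is unused
    co = {}
    for k in range(1, n + 1):
        co[k] = mbump(k, m, n, a, co)
    halves = []
    for _ in range(2):
        half = []
        while True:
            half.append("".join(map(str, a[1:])))
            if not next(co[1]):
                break
        halves.append(half)
    return halves

print(mixed_cycle(2, 4))
```

For m = 2 and n = 4 the lamps light up in the order $a_2, a_1, a_3, a_4$, which follows the chain $a_4 \le a_3 \le a_1 \le a_2$ from its top end downward.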

Still another simple but significant variant arises when several separate chains are 
present. The digraph might, for example, be 

[Figure: three separate chains, 1 → 2, the single vertex 3, and 4 → 5 → 6] 

in which case we want all 6-tuples of bits $a_1 \ldots a_6$ with $a_1 \le a_2$ and $a_4 \le a_5 \le a_6$. In 
general, suppose there is a set of endpoints $E = \{e_1, \ldots, e_m\}$ such that 

$1 = e_1 < \cdots < e_m \le n$, 

and we want 

$a_k \in \{0, 1\}$ for $1 \le k \le n$; $a_{k-1} \le a_k$ for $k \notin E$. 

(The set E is {1, 3, 4} in the example shown.) The following coroutines ebump[k], for 
$1 \le k \le n$, generate all such n-tuples if the driver invokes ebump[$e_m$]: 

Boolean coroutine ebump[k]; 
while true do begin 
    awake0: if k + 1 ∉ E ∪ {n + 1} then while ebump[k + 1] do return true; 
        a[k] := 1; return true; 
    asleep1: if k ∈ E \ {1} then return ebump[k'] else return false; 
    awake1: a[k] := 0; return true; 
    asleep0: if k + 1 ∉ E ∪ {n + 1} then while ebump[k + 1] do return true; 
        if k ∈ E \ {1} then return ebump[k'] else return false; 
end. 

Here k' stands for $e_{j-1}$ when $k = e_j$ and j > 1. These routines reduce to poke when 
E = {1, 2, ..., n} and to bump when E = {1}. If E = {1, 3, 4}, they will generate all 24 
bit patterns such that $a_1 \le a_2$ and $a_4 \le a_5 \le a_6$ in the order 

000000, 000001, 000011, 000111, 001111, 001011, 001001, 001000, 
011000, 011001, 011011, 011111, 010111, 010011, 010001, 010000, 
110000, 110001, 110011, 110111, 111111, 111011, 111001, 111000; 

then the sequence will reverse itself: 

111000, 111001, 111011, 111111, 110111, 110011, 110001, 110000, 
010000, 010001, 010011, 010111, 011111, 011011, 011001, 011000, 
001000, 001001, 001011, 001111, 000111, 000011, 000001, 000000. 
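
The listing above can be reproduced with a generator-based simulation of ebump. This is a sketch under the same encoding assumptions as the earlier examples; the dictionary `prev` (holding k' for each endpoint other than $e_1$) and the helper `half_cycle` are hypothetical names.

```python
def ebump(k, n, E, a, co):
    """Coroutine ebump[k]; E is the sorted list of chain endpoints, E[0] == 1."""
    prev = {E[j]: E[j - 1] for j in range(1, len(E))}   # k' for k in E \ {1}
    has_successor = (k + 1) not in E and k + 1 <= n     # k+1 not in E ∪ {n+1}
    while True:
        if has_successor:                               # awake0
            while next(co[k + 1]):
                yield True
        a[k] = 1
        yield True
        if k in prev:                                   # asleep1
            yield next(co[prev[k]])
        else:
            yield False
        a[k] = 0                                        # awake1
        yield True
        if has_successor:                               # asleep0
            while next(co[k + 1]):
                yield True
        if k in prev:
            yield next(co[prev[k]])
        else:
            yield False

def half_cycle(n, E):
    """Patterns produced until the root ebump[e_m] first returns false."""
    a = [0] * (n + 1)                                   # lamps a[1..n]
    co = {}
    for k in range(1, n + 1):
        co[k] = ebump(k, n, E, a, co)
    root = co[E[-1]]
    out = []
    while True:
        out.append("".join(map(str, a[1:])))
        if not next(root):
            return out

print(half_cycle(6, [1, 3, 4]))
```

With E = [1, 2, ..., n] the same code behaves like poke, since no vertex then has a successor within its chain.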

In our examples so far we have discussed several families of cooperating coroutines and 
claimed that they generate certain n-tuples, but we haven't proved anything rigorously. A 
formal theory of coroutine semantics is beyond the scope of this paper, but we should at 
least try to construct a semi-formal demonstration that ebump is correct. 

The proof is by induction on |E|, the number of chains. If |E| = 1, ebump[k] reduces 
to bump[k], and we can argue by induction on n. The result is obvious when n = 1. If 
n > 1, suppose repeated calls on bump[2] cause $a_2 \ldots a_n$ to run through the (n - 1)-tuples 
$\alpha_0, \alpha_1, \alpha_2, \ldots$, where bump[2] is false when it produces $\alpha_t = \alpha_{t-1}$. Such a repetition will 
occur if and only if t is a multiple of n, because n is the number of distinct (n - 1)-tuples 
with $a_2 \le \cdots \le a_n$. We know by induction that the sequence has reflective symmetry: 
$\alpha_j = \alpha_{2n-1-j}$ for $0 \le j < n$. Furthermore, $\alpha_{j+2n} = \alpha_j$ for all $j \ge 0$. To complete the 
proof we observe that repeated calls on bump[1] will produce the n-tuples 

$0\alpha_0,\ 0\alpha_1,\ \ldots,\ 0\alpha_{n-1},\ 1\alpha_n,$ 
$1\alpha_n,\ 0\alpha_n,\ 0\alpha_{n+1},\ \ldots,\ 0\alpha_{2n-1},$ 
$0\alpha_{2n},\ 0\alpha_{2n+1},\ \ldots,\ 0\alpha_{3n-1},\ 1\alpha_{3n},$ 

and so on, returning false every (n + 1)st step as desired. 

If |E| > 1, let $E = \{e_1, \ldots, e_m\}$, so that $e_m' = e_{m-1}$, and suppose that repeated calls 
on ebump[$e_{m-1}$] produce the $(e_m - 1)$-tuples $\alpha_0, \alpha_1, \alpha_2, \ldots$. Also suppose that calls on 
ebump[$e_m$] would set the remaining bits $a_{e_m} \ldots a_n$ to the $(n + 1 - e_m)$-tuples $\beta_0, \beta_1, \beta_2, \ldots$, 
if E were empty instead of $\{e_1, \ldots, e_m\}$; this sequence $\beta_0, \beta_1, \beta_2, \ldots$ is like the output 
of bump. The $\alpha$ and $\beta$ sequences are periodic, with respective periods of length 2M and 
2N for some M and N; they also have reflective symmetry $\alpha_j = \alpha_{2M-1-j}$, $\beta_k = \beta_{2N-1-k}$. 
It follows that ebump[$e_m$] is correct, because it produces the sequence 

$$\begin{array}{l}
\gamma_0, \gamma_1, \gamma_2, \ldots = \alpha_0\beta_0,\ \alpha_0\beta_1,\ \ldots,\ \alpha_0\beta_{N-1}, \\
\qquad \vdots \\
\qquad \alpha_{M-1}\beta_{(M-1)N},\ \alpha_{M-1}\beta_{(M-1)N+1},\ \ldots,\ \alpha_{M-1}\beta_{MN-1}, \\
\qquad \alpha_M\beta_{MN},\ \alpha_M\beta_{MN+1},\ \ldots,\ \alpha_M\beta_{(M+1)N-1}, \\
\qquad \vdots \\
\qquad \alpha_{2M-1}\beta_{(2M-1)N},\ \alpha_{2M-1}\beta_{(2M-1)N+1},\ \ldots,\ \alpha_{2M-1}\beta_{2MN-1},\ \ldots
\end{array}$$ 

which has period length 2MN and satisfies 

$\gamma_{Nj+k} = \alpha_j\beta_{Nj+k} = \alpha_{2M-1-j}\beta_{2MN-1-Nj-k} = \gamma_{2MN-1-Nj-k}$ 

for $0 \le j < M$ and $0 \le k < N$. 

The patterns output by ebump are therefore easily seen to be essentially the same as 
the so-called reflected Gray paths for radices $e_2 + 1 - e_1, \ldots, e_m + 1 - e_{m-1}, n + 2 - e_m$ 
(see [12]); the total number of outputs is 

$(e_2 + 1 - e_1) \cdots (e_m + 1 - e_{m-1})(n + 2 - e_m)$. 
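
The product formula is easy to evaluate directly. The following sketch uses a hypothetical function name, `ebump_output_count`, and takes E as a sorted Python list with E[0] == 1.

```python
from math import prod

def ebump_output_count(E, n):
    """Product of the radices e_2+1-e_1, ..., e_m+1-e_{m-1}, n+2-e_m."""
    radices = [E[j] + 1 - E[j - 1] for j in range(1, len(E))]
    radices.append(n + 2 - E[-1])
    return prod(radices)

print(ebump_output_count([1, 3, 4], 6))   # radices 3, 2, 4
```

Each radix is one more than the length of the corresponding chain, which is the number of monotone bit patterns on that chain.
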

3. Ups and downs. Now let's consider a "fence" digraph 

[Figure: a fence digraph on vertices 1, 2, ..., n] 

which leads to n-tuples that satisfy the up-down constraints 

$a_1 \le a_2 \ge a_3 \le a_4 \ge \cdots$. 

A reasonably simple set of coroutines can be shown to handle this case, rooted at nudge[1]: 

Boolean coroutine nudge[k]; 
while true do begin 
    awake0: if k' ≤ n then while nudge[k'] do return true; 
        a[k] := 1; return true; 
    asleep1: if k'' ≤ n then while nudge[k''] do return true; 
        return false; 
    awake1: if k'' ≤ n then while nudge[k''] do return true; 
        a[k] := 0; return true; 
    asleep0: if k' ≤ n then while nudge[k'] do return true; 
        return false; 
end. 
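
A generator-based sketch of nudge follows, under the same encoding assumptions as the earlier examples. Two details are choices of this sketch rather than of the paper: a coroutine whose lamp is initially lit skips straight to 'awake1' on its first activation, and the lamps start as the first n bits of 000111000111..., the starting configuration that the text describes for these coroutines. The helper name `fence_cycle` is hypothetical.

```python
def nudge(k, n, a, co):
    """Coroutine nudge[k] for the fence a_1 <= a_2 >= a_3 <= a_4 >= ..."""
    k1, k2 = (k + 1, k + 2) if k % 2 == 1 else (k + 2, k + 1)   # (k', k'')
    start_at_awake1 = (a[k] == 1)      # read on first activation; a[k] is still initial
    while True:
        if not start_at_awake1:
            if k1 <= n:                # awake0
                while next(co[k1]):
                    yield True
            a[k] = 1
            yield True
            if k2 <= n:                # asleep1
                while next(co[k2]):
                    yield True
            yield False
        start_at_awake1 = False
        if k2 <= n:                    # awake1
            while next(co[k2]):
                yield True
        a[k] = 0
        yield True
        if k1 <= n:                    # asleep0
            while next(co[k1]):
                yield True
        yield False

def fence_cycle(n):
    """Forward and backward halves of one cycle, driven through nudge[1]."""
    a = [0] + [((k - 1) // 3) % 2 for k in range(1, n + 1)]     # 000111000111...
    co = {}
    for k in range(1, n + 1):
        co[k] = nudge(k, n, a, co)
    halves = []
    for _ in range(2):
        half = []
        while True:
            half.append("".join(map(str, a[1:])))
            if not next(co[1]):
                break
        halves.append(half)
    return halves

print(fence_cycle(4))
```

For n = 4 the run starts from 0001 and visits eight patterns before nudge[1] first returns false.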



Here (k', k'') = (k + 1, k + 2) when k is odd, (k + 2, k + 1) when k is even. But these 
coroutines do not work when they all begin at 'awake0' with $a_1 a_2 \ldots a_n = 00 \ldots 0$; they 
need to be initialized carefully. For example, when n = 6 it turns out that exactly eleven 
patterns of odd weight need to be generated, and exactly ten patterns of even weight, so 
a Gray path cannot begin or end with an even-weight pattern such as 000000 or 111111. 
One proper starting configuration is obtained if we set $a_1 \ldots a_n$ to the first n bits of the 
infinite string 000111000111..., and if we start coroutine nudge[k] at 'awake0' if $a_k = 0$, 
at 'awake1' if $a_k = 1$. For example, the sequence of results when n = 4 is 

0001    Initial configuration                                        124 
0000    nudge[1] = nudge[2] = nudge[4] = true                        124 
0100    nudge[1] = nudge[2] = true, nudge[4] = false                 1234 
0101    nudge[1] = nudge[2] = nudge[3] = nudge[4] = true             1234 
0111    nudge[1] = nudge[2] = nudge[3] = true, nudge[4] = false      123 
1111    nudge[1] = true, nudge[2] = nudge[3] = false                 13 
1101    nudge[1] = nudge[3] = true                                   134 
1100    nudge[1] = nudge[3] = nudge[4] = true                        134 
1100    nudge[1] = nudge[3] = nudge[4] = false                       134 
1101    nudge[1] = nudge[3] = nudge[4] = true                        134 
1111    nudge[1] = nudge[3] = true, nudge[4] = false                 13 
0111    nudge[1] = true, nudge[3] = false                            123 
0101    nudge[1] = nudge[2] = nudge[3] = true                        1234 
0100    nudge[1] = nudge[2] = nudge[3] = nudge[4] = true             1234 
0000    nudge[1] = nudge[2] = true, nudge[3] = nudge[4] = false      124 
0001    nudge[1] = nudge[2] = nudge[4] = true                        124 
0001    nudge[1] = nudge[2] = nudge[4] = false                       124 

Again the cycle repeats with reflective symmetry; and again, some cryptic notations appear 
that will be explained in Section 8. The correctness of nudge will follow from results we 
shall prove later. 

4. The general case. We have seen that cleverly constructed coroutines are able to 
generate Gray paths for several rather different special cases of the spider-squishing problem; 
thus it is natural to hope that similar techniques will work in the general case when an 
arbitrary totally acyclic digraph is given. The spider 

[Figure: the nine-vertex example spider] 

illustrates most of the complications that might face us, so we shall use it as a running 
example. In general we shall assume that the vertices have been numbered in preorder, as 
defined in [9, Section 2.3.2], when the digraph is considered to be a forest (ignoring the 
arc directions). This means that the smallest vertex in each component is the root of that 
component, and that all vertex numbers of a component are consecutive. Furthermore, 
the children of each node are immediately followed in the ordering by their descendants. 
The descendants of each node k form a subspider consisting of nodes k through scope(k), 
inclusive; we shall call this "spider k." For example, spider 2 consists of nodes {2, 3, 4, 5}, 
and scope(2) = 5. Our sample spider has indeed been numbered in preorder, because it 
can be drawn as a properly numbered tree with directed branches: 




The same spider could also have been numbered in many other ways, because any vertex 
of the digraph could have been chosen to be the root, and because the resulting trees can 
be embedded several ways into the plane by permuting the children of each family. 

Assume for the moment that the digraph is connected; thus it is a tree with root 1. 
A nonroot vertex x is called positive if the path from 1 to x ends with an arc directed 
towards x, negative if that path ends with an arc directed away from x. Thus the example 
spider has positive vertices {2, 3, 5, 6, 9} and negative vertices {4, 7, 8}. 

Let us write $x \to^* y$ if there is a directed path from x to y in the digraph. Removing all 
vertices x such that $x \to^* 1$ disconnects the graph into a number of pieces having positive 
roots; in our example, the removal of {1, 8} leaves three components rooted at {2, 6, 9}. 
We call these roots the positive vertices near 1, and we denote that set by $U_1$. Similarly, 
the negative vertices near 1 are obtained when we remove all vertices y such that $1 \to^* y$; 
the set of resulting roots, denoted by $V_1$, is {4, 7, 8} in our example, because we remove 
{1, 2, 3, 5, 6}. 

The relevant bit patterns $a_1 \ldots a_n$ for which $a_1 = 0$ are precisely those that we obtain 
if we set $a_j = 0$ whenever $j \to^* 1$ and if we supply bit patterns for each subspider rooted 
at a vertex of $U_1$. Similarly, the bit patterns for which $a_1 = 1$ are precisely those we obtain 
by setting $a_k = 1$ whenever $1 \to^* k$ and by supplying patterns for each subspider rooted 
at a vertex of $V_1$. Thus if $n_k$ denotes the number of bit patterns for spider k, the total 
number of suitable patterns $a_1 \ldots a_n$ is $\prod_{u \in U_1} n_u + \prod_{v \in V_1} n_v$. 
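
The counting formula can be checked on the running example. The sketch below is illustrative only: the arc list `ARCS` is an assumption, inferred from the sets and paths quoted in this section because the figure itself is not available here, and all function names are hypothetical. Each arc (j, k) means $a_j \le a_k$.

```python
from itertools import product
from math import prod

# Assumed arcs of the nine-vertex example spider (see the lead-in).
ARCS = [(1, 2), (2, 3), (4, 3), (2, 5), (1, 6), (7, 6), (8, 1), (8, 9)]

def reachable(start, vertices, forward):
    """Vertices y with start ->* y (forward) or y ->* start (backward)."""
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for j, k in ARCS:
            src, dst = (j, k) if forward else (k, j)
            if src == x and dst in vertices and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def components(vertices):
    """Connected pieces of the induced subgraph, ignoring arc directions."""
    left, parts = set(vertices), []
    while left:
        comp, stack = set(), [min(left)]
        while stack:
            x = stack.pop()
            if x in comp:
                continue
            comp.add(x)
            for j, k in ARCS:
                if x in (j, k):
                    y = k if x == j else j
                    if y in left and y not in comp:
                        stack.append(y)
        parts.append(comp)
        left -= comp
    return parts

def count(spider):
    """n_k for the spider with vertex set `spider` and root k = min(spider)."""
    k = min(spider)
    pieces_u = components(spider - reachable(k, spider, forward=False))  # rooted in U_k
    pieces_v = components(spider - reachable(k, spider, forward=True))   # rooted in V_k
    return prod(count(c) for c in pieces_u) + prod(count(c) for c in pieces_v)

def brute_force(n):
    """Direct count of bit patterns that satisfy every arc."""
    return sum(all(b[j - 1] <= b[k - 1] for j, k in ARCS)
               for b in product((0, 1), repeat=n))

print(count(set(range(1, 10))), brute_force(9))
```

Under the assumed arcs, both counts agree with the 48 + 12 patterns discussed for $P_1$ and $Q_1$ later in this section.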

The sets $U_k$ and $V_k$ of positive and negative vertices near k are defined in the same 
way for each spider k. 

Every positive child of k appears in $U_k$, and every negative child appears in $V_k$. These 
are called the principal elements of $U_k$ and $V_k$. Every nonprincipal member of $U_k$ is a 
member of $U_v$ for some unique principal vertex v of $V_k$. Similarly, every nonprincipal 
member of $V_k$ is a member of $V_u$ for some unique principal vertex u of $U_k$. For example, 
the principal members of $U_1$ are 2 and 6; the other member, 9, belongs to $U_8$, where 8 is 
a principal member of $V_1$. 

We will prove that the bit patterns $a_1 \ldots a_n$ can always be arranged in a Gray path 
such that bit $a_1$ begins at 0 and ends at 1, changing exactly once. By induction, such 
paths exist for the $n_u$ patterns in each spider u for $u \in U_1$. And we can combine such 
paths into a single path that passes through all of the $\prod_{u \in U_1} n_u$ ways to combine those 
patterns, using a reflected Gray code analogous to the output of ebump in Section 2 above. 
Thus, if we set $a_k = 0$ for all k such that $k \to^* 1$, we get a Gray path $P_1$ for all suitable 
patterns with $a_1 = 0$. Similarly we can construct a Gray path $Q_1$ for the $\prod_{v \in V_1} n_v$ suitable 
patterns with $a_1 = 1$. Thus, all we need to do is prove that it is possible to construct $P_1$ 
and $Q_1$ in such a way that the last pattern in $P_1$ differs from the first pattern of $Q_1$ only 
in bit $a_1$. Then $G_1 = (P_1, Q_1)$ will be a suitable Gray path that solves our problem. 

For example, consider the subspiders for $U_1 = \{2, 6, 9\}$ in the example spider. An 
inductive construction shows that they have respectively $(n_2, n_6, n_9) = (8, 3, 2)$ patterns, 
with corresponding Gray paths 

$G_2$ = 0000, 0001, 0101, 0100, 0110, 0111, 1111, 1101; 
$G_6$ = 00, 10, 11; 
$G_9$ = 0, 1. 

We obtain 48 patterns $P_1$ by setting $a_1 = a_8 = 0$ and using $G_2$ for $a_2 a_3 a_4 a_5$, $G_6$ for 
$a_6 a_7$, and $G_9$ for $a_9$, taking care to end with $a_2 = a_6 = 1$. Similarly, the subspiders for 
$V_1 = \{4, 7, 8\}$ have $(n_4, n_7, n_8) = (2, 2, 3)$ patterns, and paths 

$G_4$ = 0, 1; 
$G_7$ = 0, 1; 
$G_8$ = 00, 01, 11. 

We obtain 12 patterns $Q_1$ by setting $a_1 = a_2 = a_3 = a_5 = a_6 = 1$ and using $G_4$ for $a_4$, $G_7$ 
for $a_7$, and $G_8$ for $a_8 a_9$, taking care to begin with $a_8 = 0$. Combining these observations, 
we see that $P_1$ should end with 011011100, and $Q_1$ should begin with 111011100. 

In general, the last element of $P_k$ and the first element of $Q_k$ can be determined as 
follows: For all children j of k, set $a_j \ldots a_{scope(j)}$ to the last element of the previously 
computed Gray path $G_j$ if j is positive, or to the first element of $G_j$ if j is negative. 
Then set $a_k = 0$ in $P_k$, $a_k = 1$ in $Q_k$. It is easy to verify that these rules make $a_j = 0$ 
whenever $j \to^* k$, and $a_j = 1$ whenever $k \to^* j$, for all j such that $k < j \le scope(k)$. A 
reflected Gray code based on the paths $G_u$ for $u \in U_k$ can be used to construct $P_k$ ending 
at the transition values, having $a_k = 0$; and $Q_k$ can be constructed from those starting 
values based on the paths $G_v$ for $v \in V_k$, having $a_k = 1$. Thus we obtain a Gray path 
$G_k = (P_k, Q_k)$. 

We have therefore constructed a Gray path for spider 1, proving that the spider-squishing
problem has a solution when the underlying digraph is connected. To complete
the construction for the general case, we can artificially ensure that the graph is connected
by introducing a new vertex 0, with arcs from 0 to the roots of the components. Then
P_0 will be the desired Gray path, if we suppress bit a_0 (which is zero throughout P_0).

5. Implementation via coroutines. By constructing families of sets U_k and V_k and
identifying principal vertices in those sets, we have shown the existence of a Gray path for
any given spider-squishing problem. Now let's make the proof explicit by constructing a
family of coroutines that will generate the successive patterns a_1 ... a_n dynamically, as in
the examples worked out in Sections 1-3 above.

First let's consider a basic substitution or "plug-in" operation that applies to corou- 
tines of the type we are using. Consider the following coroutines X and Y: 

Boolean coroutine X; 
while true do begin 

while A do return true; 

return false; 

while B do return false; 

if C then return true; 

end; 

Boolean coroutine Y; 
while true do begin 

while X do return true; 

return Z; 

end. 

Here X is a more-or-less random coroutine that invokes three coroutines A, B, C; coroutine
Y has a special structure that invokes X and an arbitrary coroutine Z ≠ X, Y. Clearly
Y carries out essentially the same actions as the slightly faster coroutine XZ that we get
from X by substituting Z wherever X returns false:

Boolean coroutine XZ; 
while true do begin 

while A do return true; 
return Z; 

while B do return Z; 
if C then return true; 
end. 

This plug-in principle applies in the same way whenever all return statements of X
are either 'return true' or 'return false'. And we could cast XZ into this same mold, if
desired, by writing 'if Z then return true else return false' in place of 'return Z'.
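
The plug-in principle can be tried out with Python generators standing in for coroutines: invoking a coroutine becomes next(...), and 'return' becomes yield. This is only a sketch under that modeling assumption; the periodic Boolean streams used for A, B, C, and Z are arbitrary stand-ins chosen for illustration, not coroutines from this paper.

```python
from itertools import cycle, islice

def X(A, B, C):
    while True:
        while next(A):
            yield True
        yield False
        while next(B):
            yield False
        if next(C):
            yield True

def Y(X_co, Z):
    while True:
        while next(X_co):
            yield True
        yield next(Z)

def XZ(A, B, C, Z):
    # X with every 'return false' replaced by 'return Z'
    while True:
        while next(A):
            yield True
        yield next(Z)
        while next(B):
            yield next(Z)
        if next(C):
            yield True

def streams():
    """Fresh deterministic stand-ins for A, B, C, Z (arbitrary illustrative choices)."""
    return (cycle([True, True, False]), cycle([True, False]),
            cycle([False, True]), cycle([True, False, False]))

A, B, C, Z = streams()
via_y = list(islice(Y(X(A, B, C), Z), 40))
A, B, C, Z = streams()
via_xz = list(islice(XZ(A, B, C, Z), 40))
print(via_y == via_xz)
```

Both versions consult A, B, C, and Z in the same order, so the two result sequences agree call by call; XZ merely saves the extra hop through Y.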

In general we want to work with coroutines whose actions produce infinite sequences
α_1, α_2, ... of period length 2M, where (α_M, ..., α_{2M-1}) is the reverse of (α_0, ..., α_{M-1}),
and where the coroutine returns false after producing α_t if and only if t is a multiple of M.
The proof at the end of Section 2 shows that a construction like coroutine Y above, namely

Boolean coroutine AtimesB;
while true do begin
while B do return true;
return A;
end

yields a coroutine that produces such sequences of period length 2MN from coroutines A
and B of period lengths 2M and 2N, when A and B affect disjoint bit positions of the
output sequences.
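
As a small illustration of the product construction, the following Python sketch models coroutines as generators and applies AtimesB to two one-bit coroutines. The helper flip (a coroutine with M = 2 that toggles a single bit) and the function trace are assumptions introduced for this example only.

```python
def flip(a, k):
    """One-bit coroutine: change bit a[k] and report True, then report False."""
    while True:
        a[k] ^= 1
        yield True
        yield False

def a_times_b(A, B):
    """AtimesB: let B run; whenever B reports False, pass the call on to A."""
    while True:
        while next(B):
            yield True
        yield next(A)

def trace(a, co, calls):
    """Invoke co repeatedly; record (result, bits a_1 a_2 ...) after each call."""
    out = []
    for _ in range(calls):
        result = next(co)
        out.append((result, "".join(map(str, a[1:]))))
    return out

a = [None, 0, 0]                       # a[0] is unused, so that a[k] is bit a_k
co = a_times_b(flip(a, 1), flip(a, 2))
for result, bits in trace(a, co, 8):
    print(result, bits)
```

Starting from 00, the eight calls visit 01, 11, 10, report false, and then visit 11, 01, 00 before reporting false again. With M = N = 2 the period length is 2MN = 8, and the second half retraces the first half in reverse.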

The following somewhat analogous coroutine produces such sequences of period length 
2(M + N): 

Boolean coroutine AplusB;
while true do begin
while A do return true;
a[1] := 1; return true;
while B do return true;
return false;
while B do return true;
a[1] := 0; return true;
while A do return true;
return false;
end.

This construction assumes that A and B individually generate reflective periodic sequences
α and β on bits a_2 ... a_n, and that α_M = β_0. The first half of AplusB produces

0α_0, ..., 0α_{M-1}, 1β_0, ..., 1β_{N-1},

and returns false after forming 1β_N (which equals 1β_{N-1}). The second half produces the
n-tuples

1β_N, ..., 1β_{2N-1}, 0α_M, ..., 0α_{2M-1},

which are the first M + N outputs in reverse; then it returns false, after forming 0α_{2M}
(which equals 0α_0).
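
The sum construction can be exercised in the same generator-based style. In this sketch A toggles bit a_2 and B toggles bit a_3, so that A's final pattern 10 on bits a_2 a_3 is B's initial pattern, as the construction requires. The helpers flip and trace are illustrative assumptions, not part of the paper's coroutines.

```python
def flip(a, k):
    """One-bit coroutine: change bit a[k] and report True, then report False."""
    while True:
        a[k] ^= 1
        yield True
        yield False

def a_plus_b(a, A, B):
    """AplusB: run A with a_1 = 0, switch a_1 to 1, run B; then do it all in reverse."""
    while True:
        while next(A):
            yield True
        a[1] = 1
        yield True
        while next(B):
            yield True
        yield False
        while next(B):
            yield True
        a[1] = 0
        yield True
        while next(A):
            yield True
        yield False

def trace(a, co, calls):
    """Invoke co repeatedly; record (result, bits a_1 a_2 a_3) after each call."""
    out = []
    for _ in range(calls):
        result = next(co)
        out.append((result, "".join(map(str, a[1:]))))
    return out

a = [None, 0, 0, 0]                    # a[0] is unused, so that a[k] is bit a_k
co = a_plus_b(a, flip(a, 2), flip(a, 3))
for result, bits in trace(a, co, 8):
    print(result, bits)
```

Here M = N = 2, so the period length is 2(M + N) = 8: the first half visits 000, 010, 110, 111, and the second half returns through 110 and 010 to 000.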

The coroutines that we need to implement spider squishing can be built up from 
variants of the primitive constructions for product and sum just mentioned. Consider 
the following coroutines gen[1], ..., gen[n], each of which receives an integer parameter l
whenever being invoked:

Boolean coroutine gen[k](l); integer l;
while true do begin
awake0:  if maxu[k] ≠ 0 then while gen[maxu[k]](k) do return true;
         a[k] := 1; return true;
asleep1: if maxv[k] ≠ 0 then while gen[maxv[k]](k) do return true;
         if prev[k] > l then return gen[prev[k]](l) else return false;
awake1:  if maxv[k] ≠ 0 then while gen[maxv[k]](k) do return true;
         a[k] := 0; return true;
asleep0: if maxu[k] ≠ 0 then while gen[maxu[k]](k) do return true;
         if prev[k] > l then return gen[prev[k]](l) else return false;
end.

Here maxu[k] denotes the largest element of U_k ∪ {0} (and maxv[k] likewise the largest
element of V_k ∪ {0}), and prev[k] is a function that we shall define momentarily. This
function, like the sets U_k and V_k, is statically determined from the given totally acyclic
digraph.

The idea of 'prev' is that all elements of U_l can be listed as u, prev[u], prev[prev[u]],
..., until reaching an element ≤ l, if we start with u = maxu[l]. Similarly, all elements
of V_l can be listed as v, prev[v], prev[prev[v]], ..., while those elements exceed l, starting
with v = maxv[l]. The basic meaning of gen[k] with parameter l is to run through all bit
patterns for the spiders u ≤ k in U_l, if k is a positive vertex, or for the spiders v ≤ k in V_l,
if vertex k is negative.

The example spider of Section 4 will help clarify the situation. The following table
shows the sets U_k, V_k, and a suitable function prev[k], together with some auxiliary
functions by which prev[k] can be determined in general:

If u is a positive vertex, not a root, let v_1 be the parent of u. Then if v_1 is negative,
let v_2 be the parent of v_1, and continue in this manner until reaching a positive vertex v_t,
the nearest positive ancestor of v_1. We call v_t the positive progenitor of v_1, denoted
ppro(v_1). The main point of this construction is that u ∈ U_k if and only if k is one of the
vertices {v_1, v_2, ..., v_t}. Consequently

U_k = U_l ∩ {k, k + 1, ..., scope(k)}

if l is the positive progenitor of k. Furthermore U_k and U_k' are disjoint whenever k and k'
are distinct positive vertices. Therefore we can define prev[u] for all positive nonroots u
as the largest element less than u in the set U_k ∪ {0}, where k = ppro(parent(u)) is the
positive progenitor of u's parent.

Every element also has a negative progenitor, if we regard the dummy vertex 0 as a
negative vertex that is parent to all the roots of the digraph. Thus we define prev[v] for all
negative v as the largest element less than v in the set V_k ∪ {0}, where k = npro(parent(v)).
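
The progenitor rules translate directly into a short computation. The Python sketch below builds U_k, V_k, prev, maxu, and maxv by climbing from each vertex's parent through ancestors of the opposite sign. The arrays PARENT and POSITIVE are an assumed encoding of the example spider, inferred from the Gray paths of Section 4, and build_tables is a name introduced here for illustration.

```python
# Assumed encoding of the example spider: PARENT[j] is the parent of vertex j
# (0 is the dummy vertex); POSITIVE[j] is True when the arc runs from parent to j.
PARENT = [None, 0, 1, 2, 3, 2, 1, 6, 1, 8]
POSITIVE = [None, True, True, True, False, True, True, False, False, True]

def build_tables(parent, positive):
    n = len(parent) - 1
    U = {k: set() for k in range(n + 1)}
    V = {k: set() for k in range(n + 1)}
    progenitor = {}
    for j in range(1, n + 1):
        sets = U if positive[j] else V
        k = parent[j]
        sets[k].add(j)
        # Climb through ancestors of the opposite sign; vertex 0 always stops the climb.
        while k != 0 and positive[k] != positive[j]:
            k = parent[k]
            sets[k].add(j)
        progenitor[j] = k              # ppro(parent(j)) or npro(parent(j))
    prev = {}
    for j in range(1, n + 1):
        sets = U if positive[j] else V
        prev[j] = max((x for x in sets[progenitor[j]] if x < j), default=0)
    maxu = {k: max(U[k], default=0) for k in U}
    maxv = {k: max(V[k], default=0) for k in V}
    return U, V, prev, maxu, maxv

U, V, prev, maxu, maxv = build_tables(PARENT, POSITIVE)
print(sorted(U[1]), sorted(V[1]), sorted(U[8]), prev[9], prev[6], prev[4])
```

Under this encoding the computation gives U_1 = {2, 6, 9}, V_1 = {4, 7, 8}, and U_8 = {9}, in agreement with Section 4, together with prev[9] = 6, prev[6] = 2, and prev[4] = 0.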

Notice that 9 is an element of both U_1 and U_8 in the example spider, so both gen[9](1)
and gen[9](8) will be invoked at various times. The former will invoke gen[6](1), which
will invoke gen[2](1); the latter, however, will merely flip bit a_9 on and off, because prev[9]
does not exceed 8. There is only one coroutine gen[9]; its parameter l is reassigned each
time gen[9] is invoked. (The two usages do not conflict, because gen[9](1) is invoked only
when a_1 = 0, in which case a_8 = 0 and gen[8] cannot be active.) Similarly, gen[4] can
be invoked with l = 1, 2, or 3; but in this case there is no difference in behavior because
prev[4] = 0.

In order to see why gen[k] works, let's consider first what would happen if its parameter
l were ∞, so that the test 'prev[k] > l' would always be false. In such a case gen[k] is
simply the AplusB construction applied to A = gen[maxu[k]](k) and B = gen[maxv[k]](k).

On the other hand when l is set to a number such that k ∈ U_l or k ∈ V_l, the coroutine
gen[k] is essentially the AtimesB construction, because it results when Z = gen[prev[k]](l)
is plugged in to the instance of AplusB that we've just discussed. The effect is to obtain
the Cartesian product of the sequence generated with l = ∞ and the sequence generated
by gen[prev[k]](l).

Thus we see that 'if maxu[k] ≠ 0 then while gen[maxu[k]](k) do return true' generates
the sequence P_k described in Section 4, and 'if maxv[k] ≠ 0 then while gen[maxv[k]](k)
do return true' generates Q_k. It follows that gen[k](∞) generates the Gray path G_k.
And we get the overall solution to our problem, path P_0, by invoking the root coroutine
gen[maxu[0]](0).

Well, there is one hitch: Every time the AplusB construction is used, we must be sure 
that coroutines A and B have been set up so that the last pattern of A equals the first 
pattern of B. We shall deal with that problem in Section 6. 

In the unconstrained case, when the given digraph has no arcs whatsoever, we have
U_0 = {1, ..., n} and all other U's and V's are empty. Thus prev[k] = k − 1 for 1 ≤ k ≤ n,
and gen[k](0) reduces to the coroutine poke[k] of Section 1.
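
For this unconstrained case the reduction is easy to try out. The Python sketch below specializes gen[k](0) by hand to maxu[k] = maxv[k] = 0 and prev[k] = k − 1, again with generators playing the role of coroutines. It assumes that all bits start at 0, and the names make_unconstrained and patterns are illustrative only.

```python
def make_unconstrained(n):
    """Build gen[1..n] for a digraph without arcs; return the bits and the root coroutine."""
    a = [0] * (n + 1)                  # a[0] is unused; all bits start at 0 (assumption)
    co = {}

    def gen(k):
        while True:
            a[k] ^= 1                  # awake0 / awake1: change bit a_k
            yield True
            # asleep1 / asleep0: prev[k] = k - 1 exceeds l = 0 exactly when k > 1
            yield next(co[k - 1]) if k > 1 else False

    for k in range(1, n + 1):
        co[k] = gen(k)
    return a, co[n]                    # the root coroutine is gen[maxu[0]] = gen[n]

def patterns(n):
    """Collect the bit patterns visited until the root coroutine first reports False."""
    a, root = make_unconstrained(n)
    out = ["".join(map(str, a[1:]))]
    while next(root):
        out.append("".join(map(str, a[1:])))
    return out

print(patterns(3))
```

For n = 3 the sketch visits 000, 001, 011, 010, 110, 111, 101, 100, which is Gray binary code.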

If the given digraph is the chain 1 → 2 → ... → n, the nonempty U's and V's are
U_k = {k + 1} for 0 ≤ k < n. Thus prev[k] = 0 for all k, and gen[k](l) reduces to the
coroutine bump[k] of Section 2. Similar remarks apply to cobump, mbump, and ebump.

If the given digraph is the fence 1 → 2 ← 3 → 4 ← ..., we have U_k = {k'} and
V_k = {k''} for 1 ≤ k < n, where (k', k'') = (k + 1, k + 2) if k is odd, (k + 2, k + 1) if k is
even, except that U_{n-1} = ∅ if n is odd, V_{n-1} = ∅ if n is even. Also U_0 = {1}. Therefore
prev[k] = 0 for all k, and gen[k](l) reduces to the coroutine nudge[k] of Section 3.

6. Launching. Ever since 1968, Section 1.4.2 of The Art of Computer Programming 
[9] has contained the following remark: "Initialization of coroutines tends to be a little 
tricky, although not really difficult." Perhaps that statement needs to be amended, from 
the standpoint of the coroutines considered here. We need to decide at which label each
coroutine gen[k] should begin execution when it is first invoked: awake0, asleep1, awake1,
or asleep0. And our discussion in Sections 3 and 4 shows that we also need to choose the
initial setting of a_1 ... a_n very carefully.

Let's consider the initialization of a_1 ... a_n first. The reflected Gray path mechanism
that we use to construct the paths P_k and Q_k, as explained in Section 4, complements
some of the bits. If, for example, U_k = {u_1, u_2, ..., u_m}, where u_1 < u_2 < ... < u_m, path
P_k will contain n_{u_1} n_{u_2} ... n_{u_m} bit patterns, and the value of bit a_{u_i} at the end of P_k will
equal the value it had at the beginning if and only if n_{u_1} n_{u_2} ... n_{u_{i-1}} is even. The reason
is that subpath G_{u_i} is traversed n_{u_1} n_{u_2} ... n_{u_{i-1}} times, alternately forward and backward.

In general, let

δ_{jk} = ∏_{u ∈ U_k, u < j} n_u,  if j ∈ U_k;     δ_{jk} = ∏_{v ∈ V_k, v < j} n_v,  if j ∈ V_k.
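
As a worked instance of this parity rule, the Python sketch below computes the traversal counts for U_1 = {2, 6, 9} of the example spider, using the sizes (n_2, n_6, n_9) = (8, 3, 2) from Section 4. The function name traversal_counts is an illustrative assumption.

```python
def traversal_counts(sizes):
    """Given the sizes n_u for the elements of U_k in increasing order, return for each
    position the product of the sizes that precede it (the empty product is 1)."""
    counts, running = [], 1
    for size in sizes:
        counts.append(running)
        running *= size
    return counts

sizes = [8, 3, 2]                      # (n_2, n_6, n_9) for U_1 = {2, 6, 9}
deltas = traversal_counts(sizes)
same_at_both_ends = [d % 2 == 0 for d in deltas]
print(deltas, same_at_both_ends)
```

The counts are 1, 8, and 24. Path G_2 is traversed once, so a_2 differs between the two ends of P_1, whereas G_6 and G_9 are traversed an even number of times, so a_6 and a_9 have the same values at the beginning of P_1 as at its end.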

Let α_{jk} and ω_{jk} be the initial and final values of bit a_j in the Gray path G_k for spider k,
and let τ_{jk} be the value of a_j at the transition point (the end of P_k and the beginning
of Q_k). Then α_{kk} = 0, ω_{kk} = 1, and the construction in Section 4 defines the values of
α_{ik}, τ_{ik}, and ω_{ik} for k < i ≤ scope(k) as follows: Suppose i belongs to spider j, where j is
a child of k.

• If j is positive, so that j is a principal element of U_k, we have τ_{ik} = ω_{ij}, since P_k ends
with a_j = 1. Also α_{ik} = ω_{ij} if δ_{jk} is even, α_{ik} = α_{ij} if δ_{jk} is odd. If k →* i we have
ω_{ik} = 1; otherwise i belongs to spider j', where j' is a nonprincipal element of V_k. In
the latter case ω_{ik} = α_{ij'} if ω_{j'j} + δ_{j'k} is even, otherwise ω_{ik} = ω_{ij'}. (This follows
because ω_{j'j} = τ_{j'k} and ω_{j'k} = (τ_{j'k} + δ_{j'k}) mod 2.)

• If j is negative, so that j is a principal element of V_k, we have τ_{ik} = α_{ij}, since Q_k
begins with a_j = 0. Also ω_{ik} = α_{ij} if δ_{jk} is even, ω_{ik} = ω_{ij} if δ_{jk} is odd. If i →* k
we have α_{ik} = 0; otherwise i belongs to spider j', where j' is a nonprincipal element
of U_k. In the latter case α_{ik} = α_{ij'} if α_{j'j} + δ_{j'k} is even, otherwise α_{ik} = ω_{ij'}.

For example, when the digraph is the spider of Section 4, these formulas yield

Suppose j is a negative child of k. If n_u is odd for all elements u of U_k that are less
than j, then δ_{ij} + δ_{ik} is even for all i ∈ U_j, and it follows that α_{ik} = τ_{ij} for j < i ≤ scope(j).
(If i is in spider j', where j' ∈ U_j ⊆ U_k, then α_{ik} is α_{ij'} or ω_{ij'} according as α_{j'j} + δ_{j'k}
is even or odd, and τ_{ij} is α_{ij'} or ω_{ij'} according as α_{j'j} + δ_{j'j} is even or odd; and we have
δ_{j'k} ≡ δ_{j'j} (mod 2).) On the other hand, if n_u is even for some u ∈ U_k with u < j, then
δ_{ik} is even for all i ∈ U_j, and we have α_{ik} = α_{ij} for j < i ≤ scope(j). This observation makes
it possible to compute the initial bits a_1 ... a_n in O(n) steps (see [13]).

The special nature of vertex 0 suggests that we define δ_{j0} = 1 for 1 ≤ j ≤ n, because we
use path P_0 but not Q_0. This convention makes each component of the digraph essentially
independent. (Otherwise, for example, the initial setting of a_1 ... a_n would be 01...1 in
the trivial "poke" case when the digraph has no arcs.)

Once we know the initial bits, we start gen[k] at label awake0 if a_k = 0, at label
awake1 if a_k = 1.

7. Optimization. The coroutines gen[1], ..., gen[n] solve the general spider-squishing
problem, but they might not run very fast. For example, the bump routine in Section 2
takes an average of about n/2 steps to decide which bit should be changed. We would
much prefer to use only a bounded amount of time per bit change, on the average, and
this goal turns out to be achievable if we optimize the coroutine implementation.

A brute-force implementation of the gen coroutines, using only standard features of 
Algol, can readily be written down based on an explicit stack and a switch declaration: 

Boolean val; comment the current value being returned;
integer array stack[0 : 2 * n]; comment saved values of k and l;
integer k, l, s; comment the current coroutine, parameter, and stack height;
switch sw := p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11;
integer array pos[0 : n]; comment coroutine positions;
⟨Initialize everything⟩;
p1:  if maxu[k] ≠ 0 then begin
       invoke(maxu[k], k, 2);
p2:    if val then ret(1);
     end;
     a[k] := 1; val := true; ret(3);
p3:  if maxv[k] ≠ 0 then begin
       invoke(maxv[k], k, 4);
p4:    if val then ret(3);
     end;
     if prev[k] > l then begin
       invoke(prev[k], l, 5);
p5:    ret(6);
     end
     else begin val := false; ret(6); end;
p6:  if maxv[k] ≠ 0 then begin
       invoke(maxv[k], k, 7);
p7:    if val then ret(6);
     end;
     a[k] := 0; val := true; ret(8);
p8:  if maxu[k] ≠ 0 then begin
       invoke(maxu[k], k, 9);
p9:    if val then ret(8);
     end;
     if prev[k] > l then begin
       invoke(prev[k], l, 10);
p10:   ret(1);
     end
     else begin val := false; ret(1); end;
p11: ⟨Actions of the driver program when k = 0⟩;

Here invoke(newk, newl, j) is an abbreviation for

pos[k] := j; stack[s] := k; stack[s + 1] := l; s := s + 2;
k := newk; l := newl; go to sw[pos[k]]

and ret(j) is an abbreviation for

pos[k] := j; s := s − 2;
l := stack[s + 1]; k := stack[s]; go to sw[pos[k]].

We can streamline the brute-force implementation in several straightforward ways. 
First we can use a well-known technique to simplify the "tail recursion" that occurs 
when invoke is immediately followed by ret (see [11, example 6a]): The statements
'invoke(prev[k], l, 5); p5: ret(6)' can, for example, be replaced by

pos[k] := 6; k := prev[k]; go to sw[pos[k]].

An analogous simplification is possible for the constructions of the form 'while A
do return true' that occur in gen[k]. For example, we could set things up so that
coroutine A removes two pairs of items from the stack when it returns with val = true, if
we first set pos[k] to the index of a label that follows the while statement. More generally,
if coroutine A itself is also performing such a while statement, we could allow return 
statements to remove even more than two pairs of stack items at a time. Details are left 
to the reader. 

8. The active list. The gen coroutines of Section 5 perform O(n) operations per bit
change, as they pass signals back and forth, because each coroutine carries out at most
two lines of its program. This upper bound on the running time cannot be substantially
improved, in general. For example, the bump coroutines of Section 2 typically need to
interrogate about n/2 trolls per step; and it can be shown that the nudge coroutines of
Section 3 typically involve action by about cn trolls per step, where c = (5 + √5)/10 ≈
0.724. (See [9, exercise 1.2.8-12].)

Using techniques like those of Section 7, however, the gen coroutines can always be 
transformed into a procedure that performs only O(1) operations per bit change, amortized
over all the changes. A formal derivation of such a transformation is beyond the scope of 
the present paper, but we will be able to envision it by considering an informal description 
of the algorithm that results. 

The key idea is the concept of an active list, which encapsulates a given stage of the 
computation. The active list is a sequence of nodes that are either awake or asleep. If j
is a positive child of k, node j is in the active list if and only if k = 0 or a_k = 0; if j is a
negative child of k, it is in the active list if and only if a_k = 1.

Examples of the active list in special cases have appeared in the tables illustrating 
bump in Section 2 and nudge in Section 3. Readers who wish to review those examples 
will find that the numbers listed there do indeed satisfy these criteria. Furthermore, a 
node number has been underlined when that node is asleep; bit a_j has been underlined if
and only if j is asleep and in the active list.

Initially a_1 ... a_n is set to its starting pattern as defined in Section 6, and all elements
of the corresponding active list are awake. To get to the next bit pattern, we perform the
following actions:

1) Let k be the largest nonsleeping node on the active list, and wake up all nodes that 
are larger. (If all elements of the active list are asleep, they all wake up and no bit 
change is made; this case corresponds to gen[maxu [0]](0) returning false.) 

2) If a_k = 0, set a_k to 1, delete k's positive children from the active list, and insert k's
negative children. Otherwise set a_k to 0, insert the positive children, and delete the
negative ones. (Newly inserted nodes are awake.)

3) Put node k to sleep. 

Again the reader will find that the bump and nudge examples adhere to this discipline. 

If we maintain the active list in order of its nodes, the amortized cost of these three
operations is O(1), because we can charge the cost of inserting, deleting, and awakening
node k to the time when bit a_k changes. Steps (1) and (2) might occasionally need to do
a lot of work, but this argument proves that such difficult transitions must be rare. 

Let's consider the spider of Section 4 one last time. The 60 bit patterns that satisfy
its constraints are generated by starting with a_1 ... a_9 = 000001100, as we observed in
Section 6, and the Gray path G_1 begins as follows according to the active list protocol:

(Notice how node 7 becomes temporarily inactive when a_6 becomes 0.) The most dramatic
change will occur after the first n_2 n_6 n_9 = 48 patterns, when bit a_1 changes as we proceed
from path P_1 to path Q_1:

011011100 124679 
111011100 14789 

(The positive children 2 and 6 have been replaced by the negative child 8.) Finally, after 
all 60 patterns have been generated, the active list will be 14789 and a_1 ... a_9 will be
111111100. All active nodes will be napping, but when we wake them up they will be 
ready to regenerate the 60 patterns in reverse order. 
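
Steps (1), (2), and (3) can be sketched in Python as follows. The arrays PARENT and POSITIVE are an assumed encoding of the example spider, inferred from the Gray paths of Section 4, and the function name is illustrative. For brevity the sketch keeps the active list as a plain set and scans it on every step, so it shows the protocol but not the O(1) amortized bound, which needs the ordered list described above.

```python
# Assumed encoding of the example spider: PARENT[j] is the parent of vertex j
# (0 is the dummy vertex); POSITIVE[j] is True when the arc runs from parent to j.
PARENT = [None, 0, 1, 2, 3, 2, 1, 6, 1, 8]
POSITIVE = [None, True, True, True, False, True, True, False, False, True]

def active_list_patterns(parent, positive, start):
    """Yield (pattern, active list) pairs produced by steps (1)-(3), starting at `start`."""
    n = len(parent) - 1
    a = [0] + [int(c) for c in start]            # a[k] is bit a_k; a[0] is the dummy bit
    children = {k: [] for k in range(n + 1)}
    for j in range(1, n + 1):
        children[parent[j]].append(j)

    def belongs(j):
        """Active-list membership of node j, decided by the bit of its parent."""
        k = parent[j]
        return (k == 0 or a[k] == 0) if positive[j] else a[k] == 1

    active = {j for j in range(1, n + 1) if belongs(j)}
    asleep = set()
    while True:
        yield "".join(map(str, a[1:])), sorted(active)
        awake = active - asleep
        if not awake:                            # all nodes are napping: wake up and stop
            asleep.clear()
            return
        k = max(awake)                           # step 1: largest nonsleeping node
        asleep = {j for j in asleep if j < k}    #         wake up all larger nodes
        a[k] ^= 1                                # step 2: change bit a_k ...
        for j in children[k]:                    #         ... and update k's children
            if belongs(j):
                active.add(j)
            else:
                active.discard(j)
        asleep.add(k)                            # step 3: put node k to sleep

run = list(active_list_patterns(PARENT, POSITIVE, "000001100"))
print(len(run), run[47], run[48], run[-1])
```

With the assumed encoding, the sketch produces 60 distinct patterns, passes from 011011100 with active list 124679 to 111011100 with active list 14789 after the first 48 patterns, and stops at 111111100.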

It should be clear from these examples, and from a careful examination of the gen 
coroutines, that steps (1), (2), and (3) faithfully implement those coroutines in an efficient 
iterative manner. 

9. Additional optimizations. The algorithm of Section 8 can often be streamlined
further. For example, if j and j' are consecutive positive children of k and if V_j is empty,
then j and j' will be adjacent in the active list whenever they are inserted or deleted.
We can therefore insert or delete an entire family en masse, in the special case that all
nodes are positive, if the active list is doubly linked. This important special case was first
considered by Koda and Ruskey [14]; see also [12, Algorithm 7.2.1.1K].

Further tricks can in fact be employed to make the active list algorithm entirely 
loopless, in the sense that O(1) operations are performed between successive bit changes
in all cases — not only in an average, amortized sense. One idea, used by Koda and 
Ruskey in the special case just mentioned, is to use "focus pointers" to identify the largest 
nonsleeping node (see [7] and [12, Algorithm 7.2.1.1L]). Another idea, which appears to 
be necessary when both positive and negative nodes appear in a complex family, is to 
perform lazy updates to the active list, changing links only gradually but before they are 
actually needed. Such a loopless implementation, which moreover needs only O(n) steps
to initialize all the data structures, is described fully in [13]. It does not necessarily run
faster than a more straightforward amortized O(1) algorithm, from the standpoint of total
time on a sequential computer; but it does prove that a strong performance guarantee is 
achievable, given any totally acyclic digraph. 

10. Conclusions and acknowledgements. We have seen that a systematic use of co- 
operating coroutines leads to a generalized Gray code for generating all bit patterns that 
satisfy the ordering constraints of any totally acyclic digraph. Furthermore those corou- 
tines can be implemented efficiently, yielding an algorithm that is faster than previously 
known methods for that problem. Indeed, the algorithm is optimum, in the sense that its 
running time is linear in the number of outputs. 

Further work is clearly suggested in the heretofore neglected area of coroutine trans- 
formation. For example, we have not discussed the implementation of coroutines such as 

Boolean coroutine copoke [k] ; 
while true do begin 

if k < n then while copoke[k + 1] do return true; 

a[k] := 1 — a[k]; return true; 

if k < n then while copoke[k + 1] do return true; 

return false; 

end. 

These coroutines, which are to be driven by repeatedly calling copoke[1], generate Gray
binary code, so their effect is identical to repeated calls on the coroutine poke[n] in Section 2.
But copoke is much less efficient, since copoke[1] always invokes copoke[2], ...,
copoke[n] before returning a result. Although these copoke coroutines look superficially
similar to gen, they are not actually a special case of that construction. A rather large 
family of coroutine optimizations seems to be waiting to be discovered and to be treated 
formally. 

Another important open problem is to discover a method that generates the bit patterns
corresponding to an arbitrary acyclic digraph, with an amortized cost of only O(1)
per pattern. The best currently known bound is O(log n), due to M. B. Squire [17]; see
also [16, Section 4.11.2]. There is always a listing of the relevant bit patterns in which at
most two bits change from one pattern to the next [15, Corollary 1].